Abstract

In this paper, we present a framework for a mixed estimation scheme for hidden Markov models
(HMMs). A robust estimation scheme is first presented using the minimax method, which minimizes a
worst-case cost for HMMs with bounded uncertainties. Then we present a mixed estimation scheme that
minimizes a risk-neutral cost subject to a constraint on the worst-case cost. Some simulation results are
also presented to compare these estimation schemes in the presence of uncertainties in the noise model.

1  Introduction 

A hidden Markov model (HMM) is a stochastic process that usually consists of a state process that is a
finite-state Markov chain and an observation or measurement process that is a function of the state process
corrupted by noise. The observation process can be discrete-range or continuous-range. In this paper, we
are interested in discrete-time homogeneous Markov chains taking values in a finite-dimensional state
space and observation processes that are continuous-range, i.e., they are observed in continuous-range
noise. Precise details about our signal model will be given in the next section. In short, we are interested
in developing a robust estimation algorithm for a class of HMMs with unknown but bounded uncertainties
and in solving a “mixed” estimation problem that minimizes a quadratic cost subject to a constraint on a
worst-case cost for a class of HMMs with random disturbances. The background and motivation behind
formulating and solving such problems are given below.

HMMs are known to be good models of many random nonlinear physical processes, and HMM signal
processing has many applications in diverse areas such as speech recognition, communication systems,
biological signal processing, frequency tracking and fault detection. In all these applications, the basic
algorithm involves estimation of the state and the parameters of the Markov chain that describes the
state process. State estimation for HMMs is usually done by calculating the “forward variable” [1], which
is essentially a conditional probability mass function of the state given the observations. It can be
calculated recursively given the initial state distribution, the transition probability matrix of the
Markov chain, the statistics of the measurement noise and the observations. One can then define a
suitable state estimate (e.g. the “MAP” estimate or the conditional mean estimate) based on this
conditional distribution of the state. This estimate is essentially a minimum-variance or “risk-neutral”
state estimate in the sense that it does not account for uncertainties in the model. As opposed to this,
a class of robust estimation algorithms known as “risk-sensitive” estimation schemes, following the ideas
of risk-sensitive control [2] [3] [4] [5], was developed in [6] (for linear Gaussian signal models), [7]
(for a class of nonlinear signal models) and [8] (for hidden Markov models). Risk-sensitive estimation
essentially minimizes the expectation of an exponential of a quadratic (or more general convex) cost and
thus penalizes the higher-order moments of the estimation error energy to provide robustness against
model uncertainties. Recently, a more meaningful insight into the robustness offered by risk-sensitive
estimation has been given in [9].

* Research supported by ONR contract 01-5-28834 under the MURI Center for Auditory and Acoustics Research, by NSF grant
01-5-23422 and by the Lockheed Martin Chair in Systems Engineering.

† Institute for Systems Research, University of Maryland, College Park, MD 20742

‡ Institute for Systems Research, Department of Electrical Engineering, University of Maryland, College Park, MD 20742

However, the setting of risk-sensitive estimation schemes is stochastic in nature. In general, small
noise limit results show that risk-sensitive estimation algorithms can be connected to a deterministic
worst-case noise estimation problem given by a differential dynamic game [10] [5] [11] ($H_\infty$
estimation for linear Gaussian systems). Risk-sensitive output feedback control problems for HMMs have
been treated in [12] [13] [14], and relations have been drawn to robust control for finite-state machines.
In particular, in [12], a deterministic model for uncertainties is introduced, leading to a dynamic game
formulation of the robust control problem. A random perturbation of the deterministic system is treated
as an HMM, and the stochastic control problem for this HMM is shown to be related to the dynamic game
problem for the deterministic model using small noise limits. However, in the general framework of [12],
no specific choices of the cost functions associated with the disturbances are given. In addition,
depending on the nature of the disturbances (often a mixture of random and unknown but bounded
disturbances [15]), it is often necessary to introduce a trade-off between the risk-neutral and the
risk-sensitive or robust estimation objectives. One way to do this is to introduce a “mixed” criterion
where a risk-neutral cost is minimized subject to a constraint on the worst-case cost. A mixed
risk-neutral and minimax control problem is solved for HMMs in [16].

In this paper, we formulate a robust estimation problem for a hidden Markov model with unknown but
bounded uncertainties. Following ideas similar to [12], we set up a dynamic game problem for the robust
estimation scheme with appropriate choices for the cost functions associated with the disturbances in the
state process (reflected by bounded variations of the transition probability matrix), the observation
process (reflected by the additive continuous-range independent bounded noise) and the initial
distribution of the state. The objective of the robust estimation problem is to obtain state estimates
that minimize a worst-case cost over a finite horizon when the estimates are constrained to the set of
unit vectors. Next, we extend the ideas of [16] to set up a “mixed” estimation problem that minimizes a
risk-neutral or quadratic cost subject to a constraint on the worst-case cost described before.
Simulation results show that when bounded disturbances are present, minimax estimation outperforms
risk-neutral estimation, and mixed estimation keeps the worst-case cost within its constraint whereas
risk-neutral estimation does not. We also compare the performance of minimax estimation with that of
risk-sensitive estimation.

In Section 2, we describe the signal model and give precise statements of the problem
objectives of robust (minimax) and mixed estimation. Sections 3 and 4 detail the algorithms for the two
estimation problems using a forward dynamic programming approach. Section 5 presents some simulation
results, while concluding remarks are given in Section 6.

2  Signal  Model 

Consider a probability space $(\Omega, \mathcal{F}, P)$ on which $X_k$ is a discrete-time homogeneous
Markov chain taking values in a finite discrete set. Define $E = \{e_1, e_2, \ldots, e_N\}$ where
$e_i = (0, \ldots, 0, 1, 0, \ldots, 0)' \in \mathbb{R}^N$ with 1 in the $i$-th position. Without loss of
generality, we can assume that $X_k \in E$. Denote the transition probability matrix as $A = (a_{ij})$
where $a_{ij} = P(X_{k+1} = e_i \mid X_k = e_j)$. We assume that there exists an $\epsilon > 0$ such that
$a_{ij} > \epsilon$. Also, $\sum_{i=1}^{N} a_{ij} = 1$, $\forall j$.

We observe a process $y_k \in \mathbb{R}^p$ such that

$$y_k = H(X_k) + v_k \qquad (1)$$

where $v_k \in \mathbb{R}^p$, $k \in \mathbb{N}$, is the disturbance in the measurement process, which may
be random with known statistical information or unknown but bounded in $L_2$ with probability one,
depending on the nature of our estimation problem. Define $\mathcal{Y}_k = (y_0, y_1, \ldots, y_k)$. In
case the disturbances are purely random, one can define $\{\mathcal{Y}_k\}$ to be the complete filtration
generated by $\sigma\{y_0, y_1, \ldots, y_k\}$. In the following sections, we will use the notation
$\{\mathcal{Y}_k\}$ with the definition appropriate to the context, without restating it each time.

Also, define $\pi_0 \in \mathbb{R}^N$ to be the initial probability distribution of the Markov chain,
such that $P(X_0 = e_i) = \pi_0(i)$. We assume that there exists a $\delta > 0$ such that
$\pi_0(i) > \delta$, $\forall i$. Obviously, $\sum_{i=1}^{N} \pi_0(i) = 1$.
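
The signal model above can be illustrated with a minimal simulation sketch. It assumes scalar observations, represents the state $e_i$ by its index, and takes the observation map as a list `H` of signal levels; the function name `simulate_hmm` and the `noise` callback are hypothetical and not part of the paper.

```python
import random

def simulate_hmm(A, pi0, H, noise, T, rng):
    """Simulate T steps of the HMM y_k = H(X_k) + v_k (scalar case).

    A[i][j] = P(X_{k+1} = e_i | X_k = e_j), so each column of A sums to one.
    pi0[i]  = P(X_0 = e_i).
    H[i]    = signal level associated with state e_i (assumed scalar).
    noise   = callable taking the random generator and returning one sample of v_k.
    Returns the list of state indices and the list of observations.
    """
    n = len(pi0)
    states, observations = [], []
    x = rng.choices(range(n), weights=pi0)[0]    # draw X_0 from pi_0
    for _ in range(T):
        states.append(x)
        observations.append(H[x] + noise(rng))   # equation (1)
        column = [A[i][x] for i in range(n)]     # distribution of X_{k+1} given X_k = e_x
        x = rng.choices(range(n), weights=column)[0]
    return states, observations
```

Note that the column of $A$ indexed by the current state is used as the next-state distribution, which matches the convention $a_{ij} = P(X_{k+1} = e_i \mid X_k = e_j)$.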

3  Minimax  state  estimation  for  bounded  uncertainties 

In this section, we assume that $v_k$, defined in the previous section, is unknown but bounded in $L_2$
with probability 1. Also, uncertainties in the matrix $A$ and the initial probability distribution vector
$\pi_0$ are assumed to be such that the assumptions made earlier on the elements of $A$ and $\pi_0$ still
hold.

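Consider a specific state sequence $X_0 = e_{i_0}, X_1 = e_{i_1}, \ldots, X_k = e_{i_k}$ and an
observation sequence $\{y_l\}$, $l = 0, 1, \ldots, k$. Define $Z_k = L X_k$, where $L$ is a real matrix
with $N$ columns. Our objective is to obtain an estimate $\hat Z_k = L \hat X_k$ ($\hat X_k \in E$) of
$Z_k$ as a Borel measurable function of $\{\mathcal{Y}_k\}$, $k \ge 0$, such that the following
worst-case cost is minimized:

$$\max_{i_0, \ldots, i_k} \left[ \sum_{l=0}^{k} \phi(e_{i_l}, \hat X_l) - \frac{1}{\mu} \left( \beta(e_{i_0}) + \sum_{l=0}^{k-1} U(e_{i_l}, e_{i_{l+1}}) + \sum_{l=0}^{k} V(y_l, e_{i_l}) \right) \right] \qquad (2)$$

where $\phi : E \times E \to \mathbb{R}$, $\beta : E \to \mathbb{R}$, $U : E \times E \to \mathbb{R}$
and $V : \mathbb{R}^p \times E \to \mathbb{R}$ have the following properties:

$\phi(e_{i_l}, \hat X_l) \ge 0$, $\forall i_l$, $\forall l$; $\infty > \beta(e_{i_0}) \ge 0$,
$\forall i_0 \in \{1, 2, \ldots, N\}$; $\infty > U(e_{i_l}, e_{i_{l+1}}) \ge 0$, $\forall i_l, i_{l+1}$,
$\forall l$; and $\infty > V(y_l, e_{i_l}) \ge 0$, $\forall i_l$, $\forall l$. Also, $\mu > 0$. We also
assume that the above-mentioned cost functions are infinite valued if any of their arguments does not
belong to its respective domain.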

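In other words, we find a $\{\mathcal{Y}_k\}$-adapted $\hat X_k$, $k \ge 0$, such that

$$\hat X_k = \arg\min_{\zeta \in E} \max_{i_0, \ldots, i_k} \left[ \sum_{l=0}^{k-1} \phi(e_{i_l}, \hat X_l) + \phi(e_{i_k}, \zeta) - \frac{1}{\mu} \left( \beta(e_{i_0}) + \sum_{l=0}^{k-1} U(e_{i_l}, e_{i_{l+1}}) + \sum_{l=0}^{k} V(y_l, e_{i_l}) \right) \right] \qquad (3)$$


Remark 3.1 Note above that at each time $k$, we only obtain $\hat X_k$, and do not obtain new values for
$\hat X_l$, $l < k$. In other words, this is a strict filtering problem.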


Now, we make specific choices of the cost functions $\phi(\cdot,\cdot)$, $\beta(\cdot)$,
$U(\cdot,\cdot)$ and $V(\cdot,\cdot)$. Denoting $Q = L'L$, we make the following choices:

$$\begin{aligned}
\phi(x, \hat x) &= \tfrac{1}{2} (x - \hat x)' Q (x - \hat x) \\
\beta(e_j) &= -\ln(\pi_0(j)) \\
U(e_i, e_j) &= -\ln(a_{ji}) \\
V(y_k, e_j) &= \| y_k - H(e_j) \|^2
\end{aligned} \qquad (4)$$

Here $\|\cdot\|$ denotes the Euclidean norm. Also, note that the above cost functions do satisfy the
assumptions we made earlier.

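With these specific choices, we can now define the following information state:


Definition 3.1 Define the information state $S_k(j)$, $k \ge 1$, $j \in \{1, 2, \ldots, N\}$ as

$$\begin{aligned}
S_k(j) &= \max_{i} \left[ S_{k-1}(i) + \tfrac{1}{2} (e_i - \hat X_{k-1})' Q (e_i - \hat X_{k-1}) - \tfrac{1}{\mu} \left( -\ln(a_{ji}) + \| y_{k-1} - H(e_i) \|^2 \right) \right] \\
S_0(i) &= \tfrac{1}{\mu} \ln(\pi_0(i)), \quad i \in \{1, 2, \ldots, N\}
\end{aligned} \qquad (5)$$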

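The minimax state estimate is given by the following theorem:

Theorem 3.1 Consider the HMM signal model defined in Section 2 and the minimax dynamic game
problem defined by (3) and (4). Then the minimax state estimate is given by

$$\begin{aligned}
\hat X_0 &= e_{i^*}, \quad i^* = \arg\min_{l} \max_{i} \left[ S_0(i) + \tfrac{1}{2} (e_i - e_l)' Q (e_i - e_l) - \tfrac{1}{\mu} \| y_0 - H(e_i) \|^2 \right] \\
\hat X_k &= e_{j^*}, \quad j^* = \arg\min_{l} \max_{i} \left[ S_k(i) + \tfrac{1}{2} (e_i - e_l)' Q (e_i - e_l) - \tfrac{1}{\mu} \| y_k - H(e_i) \|^2 \right]
\end{aligned} \qquad (6)$$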

Proof The proof is straightforward once we use the method of forward dynamic programming and
Definition 3.1 of the information state. Substituting the information state in (3), we obtain (6). □
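
The recursion (5) and the estimate (6) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes scalar observations, represents $e_i$ by its index, and weights the disturbance costs by $1/\mu$; the names `phi` and `minimax_filter` are hypothetical.

```python
import math

def phi(Q, i, l):
    # 0.5 * (e_i - e_l)' Q (e_i - e_l), written out for unit vectors
    return 0.5 * (Q[i][i] - Q[i][l] - Q[l][i] + Q[l][l])

def minimax_filter(A, pi0, H, Q, mu, ys):
    """Return the minimax state estimates (as indices) for observations ys.

    A[j][i] = a_ji = P(X_{k+1} = e_j | X_k = e_i); H[i] is the scalar
    signal level of state e_i (an assumption of this sketch).
    """
    n = len(pi0)
    S = [math.log(p) / mu for p in pi0]          # S_0(i) = (1/mu) ln pi_0(i)
    estimates = []
    for y in ys:
        V = [(y - H[i]) ** 2 for i in range(n)]  # V(y_k, e_i)
        # Worst-case cost of announcing e_l at time k, as in (6)
        worst = [max(S[i] + phi(Q, i, l) - V[i] / mu for i in range(n))
                 for l in range(n)]
        est = min(range(n), key=lambda l: worst[l])
        estimates.append(est)
        # Information state update, as in (5), using the estimate just chosen
        S = [max(S[i] + phi(Q, i, est)
                 - (-math.log(A[j][i]) + V[i]) / mu for i in range(n))
             for j in range(n)]
    return estimates
```

Each step costs $O(N^2)$ operations, and only the current information state and the latest estimate need to be stored, which reflects the strict filtering nature of the problem.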


Remark 3.2 Note that, in the case of minimax control as in [16], one needs to use backward dynamic
programming and the concept of a value function, but the need for such tools does not arise in
the strict filtering problem mentioned above.


4  Mixed  estimation 


In  this  section,  we  formulate  a  mixed  estimation  problem  for  the  HMM  defined  in  Section  2.  We  briefly 
recapitulate  the  risk-neutral  and  the  risk-sensitive  state  estimation  algorithms.  Then  we  present  a  solution 
to  the  mixed  estimation  problem. 

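For the purpose of this section, we return to the usual stochastic framework of the HMM observation
model (1). We assume that $\{v_k\}$ is a sequence of i.i.d. random variables. In this paper, we assume
that $v_k \sim N(0, \Sigma)$, $\forall k$. Recall from [1] that the risk-neutral estimate (which
essentially is the conditional mean estimate) for the state of the Markov chain is given by
$E[X_k \mid \mathcal{Y}_k]$. This is obtained from the unnormalized conditional measure (denoted by
$\alpha_k \in \mathbb{R}^N$), whose $j$-th element is proportional to
$E[\langle X_k, e_j \rangle \mid \mathcal{Y}_k]$ and which can be computed recursively as follows:

$$\alpha_k(j) = b_j(y_k) \sum_{i=1}^{N} a_{ji} \, \alpha_{k-1}(i), \qquad \alpha_0(i) = b_i(y_0) \, \pi_0(i) \qquad (7)$$

Or, in matrix notation,

$$\alpha_k = B(y_k) A \alpha_{k-1}, \qquad \alpha_0 = B(y_0) \pi_0 \qquad (8)$$

Here, $B(y_k) = \mathrm{diag}(b_1(y_k), \ldots, b_N(y_k))$ and $b_i(y_k)$ is the $N(0, \Sigma)$ density
evaluated at $y_k - H(e_i)$, i.e.,
$b_i(y_k) = (2\pi)^{-p/2} |\Sigma|^{-1/2} \exp\{ -\tfrac{1}{2} (y_k - H(e_i))' \Sigma^{-1} (y_k - H(e_i)) \}$,
where $p$ denotes the dimension of $y_k$.

The above unnormalized estimate can be normalized to yield

$$P(X_k = e_j \mid \mathcal{Y}_k) = E[\langle X_k, e_j \rangle \mid \mathcal{Y}_k] = \frac{\alpha_k(j)}{\sum_{i=1}^{N} \alpha_k(i)} \qquad (9)$$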
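
As an illustration of (7) to (9), the following minimal sketch runs the forward recursion for scalar observations with noise variance `sigma**2`. Normalizing at every step, which (9) permits because the recursion is linear in $\alpha_k$, is a choice made here to avoid numerical underflow; the function name `forward_filter` is hypothetical.

```python
import math

def forward_filter(A, pi0, H, sigma, ys):
    """Return the conditional distributions P(X_k = e_j | Y_k) for each k.

    A[j][i] = a_ji, pi0 is the initial distribution, H[i] the scalar signal
    level of state e_i and sigma the noise standard deviation (scalar case,
    an assumption of this sketch).
    """
    n = len(pi0)

    def b(j, y):
        # N(0, sigma^2) density evaluated at y - H(e_j)
        z = (y - H[j]) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    posteriors = []
    predicted = list(pi0)                          # alpha_0 uses pi_0 directly
    for y in ys:
        alpha = [b(j, y) * predicted[j] for j in range(n)]   # (7)
        total = sum(alpha)
        alpha = [a / total for a in alpha]                    # (9)
        posteriors.append(alpha)
        # A * alpha, ready to be multiplied by B(y_{k+1}) as in (8)
        predicted = [sum(A[j][i] * alpha[i] for i in range(n))
                     for j in range(n)]
    return posteriors
```

Since $X_k$ takes values in the set of unit vectors, the normalized vector returned for time $k$ is itself the conditional mean estimate $E[X_k \mid \mathcal{Y}_k]$.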


Risk-sensitive  estimation  for  HMMs 


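The risk-sensitive state estimate $\hat X_k^{rs}$ of a hidden Markov model is discussed in detail in [8].
We quote the main results here. The risk-sensitive estimation problem for an HMM described in Section 2
is given by

$$\hat X_k^{rs} = \arg\min_{\zeta \in E} E\left[ \exp\left\{ \theta \left( \tfrac{1}{2} \sum_{l=0}^{k-1} (X_l - \hat X_l^{rs})' Q (X_l - \hat X_l^{rs}) + \tfrac{1}{2} (X_k - \zeta)' Q (X_k - \zeta) \right) \right\} \,\Big|\, \mathcal{Y}_k \right] \qquad (10)$$

Here, $\theta > 0$ plays a role similar to that of $\mu$ in (2).

One then defines a new measure $\bar P$, under which $\{y_k\}$ is a sequence of i.i.d. random variables
with density $N(0, \Sigma)$. The corresponding Radon-Nikodym derivative is given by

$$\frac{dP}{d\bar P} = \Lambda_k = \prod_{l=0}^{k} \frac{g(y_l - H(X_l))}{g(y_l)}$$

where $g = N(0, \Sigma)$. Denoting the expectation under $\bar P$ as $\bar E$, and

$$\Psi_{0,k} = \tfrac{1}{2} \sum_{l=0}^{k} (X_l - \hat X_l^{rs})' Q (X_l - \hat X_l^{rs})$$

we can define the following unnormalized information state:

$$q_k(j) = \bar E\left[ \Lambda_{k-1} \exp(\theta \Psi_{0,k-1}) \langle X_k, e_j \rangle \mid \mathcal{Y}_{k-1} \right], \quad j \in \{1, 2, \ldots, N\} \qquad (11)$$

It can be shown that the information state obeys the following recursion

$$q_{k+1} = A D_k B(y_k) q_k \qquad (12)$$

where

$$D_k = \mathrm{diag}\left\{ \exp\left( \tfrac{\theta}{2} (e_1 - \hat X_k^{rs})' Q (e_1 - \hat X_k^{rs}) \right), \ldots, \exp\left( \tfrac{\theta}{2} (e_N - \hat X_k^{rs})' Q (e_N - \hat X_k^{rs}) \right) \right\}$$

and the optimal risk-sensitive state estimate is given by

$$\hat X_k^{rs} = e_{m^*}, \quad m^* = \arg\min_{m} \sum_{j=1}^{N} \exp\left( \tfrac{\theta}{2} (e_j - e_m)' Q (e_j - e_m) \right) b_j(y_k) \, q_k(j) \qquad (13)$$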
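
A minimal sketch of the risk-sensitive recursion quoted from [8] is given below for scalar observations. It assumes that the information state is initialized with $q_0 = \pi_0$, that the observation likelihood of the current sample enters the estimate at time $k$, and that rescaling $q_k$ at each step is harmless because only an arg min is taken; the names `phi` and `risk_sensitive_filter` are hypothetical.

```python
import math

def phi(Q, j, m):
    # 0.5 * (e_j - e_m)' Q (e_j - e_m), written out for unit vectors
    return 0.5 * (Q[j][j] - Q[j][m] - Q[m][j] + Q[m][m])

def risk_sensitive_filter(A, pi0, H, Q, theta, sigma, ys):
    """Return risk-sensitive state estimates (as indices) for observations ys.

    A[i][j] = P(X_{k+1} = e_i | X_k = e_j); H[j] is the scalar signal level
    of state e_j; sigma is the noise standard deviation (scalar case assumed).
    """
    n = len(pi0)
    q = list(pi0)                      # assumed initialization q_0 = pi_0
    estimates = []
    for y in ys:
        # Entries of B(y_k) q_k, up to a constant factor common to all j
        w = [math.exp(-0.5 * ((y - H[j]) / sigma) ** 2) * q[j]
             for j in range(n)]
        # Exponential-of-quadratic cost of announcing e_m
        cost = [sum(math.exp(theta * phi(Q, j, m)) * w[j] for j in range(n))
                for m in range(n)]
        est = min(range(n), key=lambda m: cost[m])
        estimates.append(est)
        # q_{k+1} = A D_k B(y_k) q_k, then rescaled to sum to one
        d = [math.exp(theta * phi(Q, j, est)) * w[j] for j in range(n)]
        q = [sum(A[i][j] * d[j] for j in range(n)) for i in range(n)]
        total = sum(q)
        q = [v / total for v in q]
    return estimates
```

As $\theta \to 0$ the exponential weights tend to $1 + \theta \phi$, so the arg min approaches that of the quadratic cost weighted by the forward variable, restricted to the set of unit vectors.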

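With this brief recapitulation of risk-neutral and risk-sensitive estimation for HMMs, we now define
the mixed estimation problem.

The objective of the mixed estimation problem for the HMM described in Section 2, with $v_k$ in (1)
random as described in the beginning of this section, is to find

$$\hat X_k = \arg\min_{\eta \in E} E\left[ (X_k - \eta)' Q (X_k - \eta) \mid \mathcal{Y}_k \right], \quad k \ge 0 \qquad (14)$$

such that the following constraint is satisfied by the worst-case cost:

$$\max_{i_0, \ldots, i_k} \left[ \sum_{l=0}^{k} \phi(e_{i_l}, \hat X_l) - \frac{1}{\mu} \left( \beta(e_{i_0}) + \sum_{l=0}^{k-1} U(e_{i_l}, e_{i_{l+1}}) + \sum_{l=0}^{k} V(y_l, e_{i_l}) \right) \right] \le 0, \quad k \ge 0 \qquad (15)$$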


In  the  next  subsection,  we  present  the  solution  to  the  mixed  estimation  problem. 


Solution  to  the  mixed  estimation  problem  for  HMMs 

Define $\mathcal{E}_k \subseteq E$, $k \ge 0$, to be
$\{ e_m : \max_j [ S_k(j) + \tfrac{1}{2} (e_j - e_m)' Q (e_j - e_m) - \tfrac{1}{\mu} V(y_k, e_j) ] \le 0 \}$,
where $S_k(j)$ is as defined in (5) and $V(\cdot,\cdot)$ is as defined in (4).

Then,  the  state  estimate  for  the  mixed  estimation  problem  is  given  by  the  following  theorem: 

Theorem 4.1 Consider the HMM signal model defined in Section 2 and the mixed estimation objective
defined by (14), (15). Then the state estimate for the mixed estimation problem is given by

$$\hat X_k = \arg\min_{\eta \in \mathcal{E}_k} \sum_{i=1}^{N} (e_i - \eta)' Q (e_i - \eta) \, \alpha_k(i) \qquad (16)$$


Proof The proof is rather straightforward once we note that $\mathcal{E}_k$ just denotes the admissible
set for the state estimates such that the constraint on the worst-case cost (15) is satisfied. One then
applies (9) to obtain (16). □
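
The mixed estimate can be sketched by combining the forward variable with the information state of Section 3. The code below is an illustration under stated assumptions: observations are scalar, the disturbance costs are weighted by $1/\mu$, and, since the case of an empty admissible set is not discussed above, the sketch falls back to the estimate with the smallest worst-case cost in that case. The names `phi` and `mixed_filter` are hypothetical.

```python
import math

def phi(Q, i, m):
    # 0.5 * (e_i - e_m)' Q (e_i - e_m), written out for unit vectors
    return 0.5 * (Q[i][i] - Q[i][m] - Q[m][i] + Q[m][m])

def mixed_filter(A, pi0, H, Q, mu, sigma, ys):
    """Return a list of (estimate, admissible_set) pairs, one per observation."""
    n = len(pi0)
    S = [math.log(p) / mu for p in pi0]      # information state S_0
    predicted = list(pi0)                    # prior for the forward variable
    results = []
    for y in ys:
        V = [(y - H[i]) ** 2 for i in range(n)]
        # Normalized forward variable alpha_k
        alpha = [math.exp(-0.5 * V[i] / sigma ** 2) * predicted[i]
                 for i in range(n)]
        total = sum(alpha)
        alpha = [a / total for a in alpha]
        # Worst-case cost of each candidate and the admissible set
        worst = [max(S[j] + phi(Q, j, m) - V[j] / mu for j in range(n))
                 for m in range(n)]
        admissible = [m for m in range(n) if worst[m] <= 0.0]
        # Assumption of this sketch: fall back to the minimax choice if empty
        candidates = admissible or [min(range(n), key=lambda m: worst[m])]
        # Quadratic cost (e_i - eta)' Q (e_i - eta) equals 2 * phi
        est = min(candidates,
                  key=lambda m: sum(2.0 * phi(Q, i, m) * alpha[i]
                                    for i in range(n)))
        results.append((est, admissible))
        # Update the information state with the estimate just chosen
        S = [max(S[i] + phi(Q, i, est)
                 - (-math.log(A[j][i]) + V[i]) / mu for i in range(n))
             for j in range(n)]
        predicted = [sum(A[j][i] * alpha[i] for i in range(n))
                     for j in range(n)]
    return results
```

Returning the admissible set alongside each estimate makes it easy to check how often the worst-case constraint actually restricts the risk-neutral choice.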


5  Simulation  results 


In this section, we present some simulation results to demonstrate the differences among these
estimation methods, i.e., the minimax, risk-sensitive and mixed estimation methods, for a given HMM. The
HMM under investigation has 10 states with the following $A$: $a_{ii} = 0.19$, $a_{ij} = 0.09$,
$\forall i \ne j$, $i, j \in \{1, 2, \ldots, 10\}$. The observation model is scalar where $v_k$ is
Gaussian distributed. Note that when the measurement disturbance is bounded, it makes sense to use the
robust estimation method in the minimax sense as presented in Section 3. Risk-sensitive or risk-neutral
methods become suboptimal in that case. However, we ran some simulations with $v_k$ being a truncated
Gaussian noise such that $|v_k| \le 5\sigma$, where $\sigma$ is the standard deviation of the Gaussian
distribution. The observed signal, i.e., $H(X_k)$, is $H'X_k$ where $H = (1\ 2\ 3\ 4\ 5\ 6\ 7\ 8\ 9\ 10)'$.
The performance criterion is the average squared error
$\frac{1}{T} \sum_{k=1}^{T} (X_k - \hat X_k)' Q (X_k - \hat X_k)$, where $\hat X_k$ represents the
risk-neutral, risk-sensitive or the minimax state estimate, as the case may be (with some abuse of
notation), and $Q = HH'$.
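
The simulation setup described above can be written down as a short sketch. It relies on the observation that with $Q = HH'$ the quadratic error $(X_k - \hat X_k)' Q (X_k - \hat X_k)$ reduces to the squared difference of the two signal levels $(H'X_k - H'\hat X_k)^2$. Drawing the truncated Gaussian noise by rejection sampling is an assumption of this sketch, and the helper names are hypothetical.

```python
import random

N = 10
# Transition matrix with a_ii = 0.19 and a_ij = 0.09 for i != j
A = [[0.19 if i == j else 0.09 for j in range(N)] for i in range(N)]
# Signal levels H = (1 2 ... 10)'
H = [float(i + 1) for i in range(N)]

def truncated_gaussian(rng, sigma, bound=5.0):
    """Draw v ~ N(0, sigma^2) conditioned on |v| <= bound * sigma."""
    while True:
        v = rng.gauss(0.0, sigma)
        if abs(v) <= bound * sigma:
            return v

def average_squared_error(true_states, estimates):
    """Average of (X_k - Xhat_k)' Q (X_k - Xhat_k) with Q = H H'.

    States and estimates are given as indices into H.
    """
    T = len(true_states)
    return sum((H[i] - H[j]) ** 2
               for i, j in zip(true_states, estimates)) / T
```

Each column of `A` sums to $0.19 + 9 \times 0.09 = 1$, as required of the transition probability matrix in Section 2.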

Figure 1 shows how the suboptimal risk-sensitive filter performs with different values of $\theta$.
Figure 2 shows the performance of the minimax robust estimate against various values of $\mu$. The
risk-neutral average error was found to be 3.3555 over a run of $T = 1000$ time points.

We also simulated the mixed estimation algorithm. Note that when $v_k$ is purely random and the
statistical information about $v_k$ is accurately known, the error achieved by the mixed algorithm is
lower bounded by that of risk-neutral estimation, since the mixed estimation optimizes over a constrained
state space whereas the risk-neutral algorithm optimizes over the complete state space, of which the
constrained state space is only a subset. However, when $v_k$ contains a mixture of random and unknown but
bounded noise, or the statistical information about the noise is not accurately known, we observed that
the constraint on the worst-case cost (15) is repeatedly violated by the risk-neutral estimation scheme
whereas the mixed estimation scheme always satisfies the constraint. We also observed similar results
when the noise was generated according to a uniform distribution but was assumed to be Gaussian. Note
that both the risk-neutral and the mixed estimation schemes become suboptimal for such uncertainties.
However, the mixed estimation scheme guarantees an upper bound on the worst-case cost whereas the
risk-neutral estimation scheme fails to do so. We do not present numerical results for these experiments
here.


6  Conclusions 

We addressed the problem of robust state estimation for hidden Markov models in this paper. We
introduced a minimax robust estimation problem for HMMs with bounded uncertainties and presented a
solution to this problem using the techniques of information states and forward dynamic programming.
We also solved a mixed estimation problem that optimizes a quadratic cost with a constraint on the
worst-case cost. Some simulation results were presented that compare the performance of these methods
in the case of uncertainties in the noise model.

We have not addressed the problem of parameter estimation for HMMs in the case of bounded
uncertainties. This problem is currently under investigation, with a possible generalization being
combined robust state and parameter estimation for more general signal models.